Conformational Control of Fast Asparagine Deamidation in a Norovirus Capsid Protein

Accelerated spontaneous deamidation of asparagine 373 and subsequent conversion into an isoaspartate has been shown to attenuate the binding of histo blood group antigens (HBGAs) to the protruding domain (P-domain) of the capsid protein of a prevalent norovirus strain (GII.4). Here, we link an unusual backbone conformation of asparagine 373 to its fast site-specific deamidation. NMR spectroscopy and ion exchange chromatography have been used to monitor the deamidation reaction of P-domains of two closely related GII.4 norovirus strains, specific point mutants, and control peptides. MD simulations over several microseconds have been instrumental to rationalize the experimental findings. While conventional descriptors such as available surface area, root-mean-square fluctuations, or nucleophilic attack distance fail as explanations, the population of a rare syn-backbone conformation distinguishes asparagine 373 from all other asparagine residues. We suggest that stabilization of this unusual conformation enhances the nucleophilicity of the backbone nitrogen of aspartate 374, in turn accelerating the deamidation of asparagine 373. This finding should be relevant to the development of reliable prediction algorithms for sites of rapid asparagine deamidation in proteins.

S 2 Figure S1. Residuals from fitting of the deamidation reaction rate in Fig. 2c. Fitting of experimental IEX data to the numerical solution of the system of differential equations describing simultaneous deamidation and dimer reassembly allows for determination of the deamidation rate k1 but not of the association rate kon. Independent of kon, variation of k1 leads to a narrow minimum in squared residual of the fit at 4.5×10 −7 s −1 . Accordingly, kon cannot be determined from curve fitting of the IEX data and its variation in the range 10 3 -10 6 M −1 s −1 has no discernible effect on the solution.   15 N TROSY HSQC spectra. The spectrum of the fresh wild type protein is shown in black, of the point mutant N373D in red, and of deamidated wild type protein containing isoAsp in grey. Figure S6. IEX behavior of the VA387 N373D mutant. Overlay of IEX chromatograms of partially deamidated wild type GII.4 VA387 (black) and the N373D mutant (red). N373D elutes at a different volume than all naturally occurring species and is, therefore, no deamidation reaction product. Figure S7. Representative IEX chromatograms of VA387 wild type and two mutant proteins N372E and H297R (cf. Fig. 3).  15 N]-labeled P-domains causes Euclidean chemical shift perturbations (CSPs) that reflect binding site occupation at different total ligand concentrations. c) CSPs larger than the mean + 2 standard deviations at the highest ligand concentration were used for global fitting against the law of mass action to yield the dissociation constant KD. The affinity towards fucose is identical to that found previously for GII.4 Saga P-domains 1 . Figure S9. Monitoring of the deamidation of a 13-mer peptide harboring the sequence of the deamidation loop of the GII.4 VA387 NoV P-domain with 2D NMR spectroscopy. Chemical shift assignments are given in Tab. S4. S 8 Figure S10. TOCSY spectra of GII.4 Saga and VA387 model peptides before and after incubation at 37 °C. Asn decay rates were quantified based on the respective HB signal intensities. Chemical shift assignments are given in Tab. S3 and S4.

S 9
Figure S11. a) Free energy maps of the backbone torsion angles j and y for all Asn residues in the SAGA P-dimer. b) Scatter plot of the backbone torsion angles, in which the color codes the nucleophilic attack distances. The color scales apply to all panels. The data was pooled from 5 individual replica simulations of 1 µs sampling time each. The backbone syn conformation belongs to φ, ψ values of around -180°, 0°. It is well populated an allows close distances. a b S 10 Figure S12. a) Free energy maps of the sidechain torsion angles c1 and c2 for all Asn residues in the SAGA P-dimer. b) Scatter plot of the side chains torsion angles, in which the color codes the nucleophilic attack distances. The color scales apply to all panels. The data was pooled from 5 individual replica simulations of 1 µs sampling time each. Short attack distances are frequently observed for c2 angles of ~60° and ~180°.  S19: Interaction (heavy-atom distance < 0.55 nm) and hydrogen bond (donor hydrogen to acceptor distance ≤ 0.3 nm, angle ≥ 150°) occupancies between selected residues of VA387 a) Only sidechain atoms are considered. b) All atoms are considered. c) Hydrogen bonds identified in chain A of the P-dimer, averaged over the five replicates. d) Hydrogen bonds identified in chain B of the Pdimer, averaged over the five replicates. Hydrogen bonds are only shown for occupancies ≥ 0.1.

Supplementary Material and Methods Peptide Synthesis
Materials: Chemical reagents and solvents for the peptide syntheses were of peptide-synthesis grade; solvents for HPLC and spectroscopy were of HPLC or spectroscopy grade. Fmoc-protected amino acids, Methods: Solid-phase peptide synthesis was carried out on an automatic peptide synthesizer (Syro I, Biotage). The analytical HPLC equipment was from Thermo Fisher Scientific (Ultimate 3000). The analytical column was from Thermo Fisher Scientific (Syncronis C18 , 4.6x250 mm). The gradient used for analytical HPLC was the following: 3% B for 8 min, up to 60% B over 35 min (A = H2O with 0.06% TFA; B = CH3CN with 0.05% TFA). MALDI-TOF mass spectra were recorded on an Autoflex mass spectrometer from Bruker Daltonics using αcyano-4-hydroxycinnamic acid as matrix.  Figure S20. Analytical HPLC of the synthetic peptides used in this work (see Table S7 for tR values and gradient).  Table S7 for Mtheor.).

Supplementary Methods Molecular Dynamics
Theoretical conformational sampling was achieved using explicit-solvent, full-atomistic equilibrium molecular dynamics. Two molecular systems were subjected to molecular dynamics integration: 1. the P-protein dimer from strain GII.4 SAGA, and 2. the P-protein dimer from strain VA387. For both systems, data were collected from five trajectory replica of 1 µs length each, which were individually equilibrated using different initial velocity distributions. One Saga MD 1 μs trajectory was used previously to sample protein conformations for ensemble docking to the bile acid binding grove 2 . The SAGA P-dimer simulations also contained 5 molecules of bile acid salt, which did not form any stable interactions with the P-dimers. In both P-dimers, histidine moieties were protonated according to possible hydrogen bond formations with neighboring amino acids. In particular, histidine residues 292, 347, 417, 460, and 501 were protonated at the Nϵ position, histidines 378, 396, 414, and 490 at the Nδ. For histidine 505 we chose to protonate at both Nϵ and Nδ.
For the SAGA P-dimer, the molecular simulation tasks were performed with GROMACS 5.1.5 3-8 using CHARMM36 force field parameters 9 . Modeling of the initial system was attained with CHARMM-GUI solvation builder 10 using the X-ray structure PDB 4OOX 11 . CHARMM-GUI was also used for solvation with TIP3P water and ionization to 0.15 M NaCl. The periodic simulation box was set to a cubic shape of 9.3x9.3x9.3 nm³ volume, corresponding to 2 nm water layers in each direction around the central protein dimer. Prior to dynamics integration, the system was minimized for 5000 steps using a steepest descent algorithm. Dynamics were initiated by assigning velocities according to a Maxwell-Boltzmann distribution at 303.15 K, followed by NVT equilibration for 200 ps (time step 0.002 ps) using Nose-Hoover 12-14 temperature coupling (coupling constant 0.4 ps-1, reference temperature 303.15 k). Protein and solvent were coupled to individual baths. To relax the box volume, 200 ps of NPT (time step 0.002 ps) sampling using an isotropic Berendsen coupling 15-16 with a reference pressure of 1 ATM and a compressibility of 4.5e-5 ATM -1 were attached. The box volume was adjusted every 0.5 ps. During minimization and equilibration, the backbone atoms were restrained by 400 kJ/mol/nm and the sidechain atoms by 40 kJ/mol/nm. For the 1 µs of unrestrained production (time step 0.002 ps), Parinello-Rahman pressure coupling [17][18] was applied instead and the temperature coupling constant was increased to 2 ps. Snapshots were stored every 20 ps. During all the steps, covalent bonds to hydrogen atoms were constraint using LINCS 19-20 as a solver. Coulombic interactions were computed using the PME method 21-23 and a cutoff of 1.2 nm. Van-der-Waals interactions had a cutoff 1.2 nm with a force-switch modifier starting at 1.0 nm. Center of mass movement of the whole system was removed every 100 steps.
The simulation protocol was marginally updated for the VA387 P-dimer MD calculations. In particular, we here used GROMACS 2018.3 and discarded the NPT equilibration step because the long simulation time renders the initial few nanoseconds of box size equilibration negligible. However, we used an initial NVT equilibration of 125.000 steps (0.001 ps time step). Additionally, we applied restraints to the protein dihedrals during the equilibration (4.0 kJ/mol/deg). Also, the solvation box size was slightly smaller (9.2x9.2x9.2 nm³). Other parameters were identical. We note that the SAGA calculations were performed on multiple CPU nodes, whereas the VA387 simulations were achieved using single CUDA GPU nodes. However, we do not expect the adjustments to affect the overall outcome of our calculations. Regarding the length of the trajectories and the substantial computational effort, we justify the adjustments by a better utilization of compute resources.
Data analysis and visualization were carried out with GROMACS tools and the Python libraries NumPy 24-25 , MDTraj 26 and MatPlotLib 27 . The root mean squared fluctuation (RMSF) was computed using gmx rmsf. Only the backbone atoms C, N, CA and O were considered. Rotational und translational motions were removed using least-square super positioning of the backbone atoms. The per-residue solvent accessible surface area (SASA) was computed with gmx sasa and probe radius of 0.14 nm and 24 dots. To calculate the relative surface accessibility of the Asn residues, we divided their absolute SASA by 1.95 nm², corresponding to the theoretical maximum SASA for Asn 28 . The sidechain torsion angles of the Asn residues, as well as the distances from the Cγ atoms of Asn to the backbone nitrogen atoms of the subsequent amino acids were computed with MDTraj. The Asn torsion angles are defined as: φ: Ci-1-Ni-Cai-Ci, ψ: Ni-Cai-Ci-Ni+1, χ1: N-Ca-Cb-Cg, and χ2: Ca-Cb-Cg-Od (Figure 22 a). The free energy maps were constructed from the 2D probability densities as estimated by binning the data to 100 x 100 bins of 2π/100 widths. The relative free energies in units of kBT are computed as the negative natural logarithm of the probability density. Clustering was performed in the cosine-sine feature space spanned by the transformation of the four torsion angles z(φ)=[cos (φ), sin(φ)]. Hierarchical densitybased clustering was calculated using the HDBSAN 29 algorithm with cluster selection alpha value of 0.5, a minimum sample size of 100 and a minimum cluster size of 100 (other parameters were left default).
The Bürgi-Dunitz (BD) 30 and Flipping-Lodge (FL) 31 angles were calculated to better describe the nucleophilic attack geometry ( Figure S22b). They are based on a coordinate transformation that centers the carbonyl carbon to the origin and the carbonyl plane onto the XY-plane. Then, the BD angle is defined as the angle between the vectors connecting the carbonyl carbon with carbonyl oxygen and the carbonyl carbon with nucleophile. It can be described as the as altitude angle of the nucleophile when the electrophile (carbonyl carbon) is the reference. It is between 0 and 180°. The FL angle can be imaged as the inclination angle of the nucleophile relative to the normal of the carbonyl plane. Here, it is calculated as the pseudo-torsion angle between the carbonyl plane normal vector, the carbonyl carbon to carbonyl vector and the carbonyl carbon to nucleophile angle. It defined in a way that it is positive for a rotation towards the Cβ and negative towards Nδ2. Its range is between -180° and 180°.